AITopics | tight nonparametric convergence rate

Collaborating Authors

tight nonparametric convergence rate

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model

Neural Information Processing SystemsDec-23-2025, 19:47:33 GMT

In the context of statistical supervised learning, the noiseless linear model assumes that there exists a deterministic linear relation $Y = \langle \theta_*, \Phi(U) \rangle$ between the random output $Y$ and the random feature vector $\Phi(U)$, a potentially non-linear transformation of the inputs~$U$. We analyze the convergence of single-pass, fixed step-size stochastic gradient descent on the least-square risk under this model. The convergence of the iterates to the optimum $\theta_*$ and the decay of the generalization error follow polynomial convergence rates with exponents that both depend on the regularities of the optimum $\theta_*$ and of the feature vectors $\Phi(U)$. We interpret our result in the reproducing kernel Hilbert space framework. As a special case, we analyze an online algorithm for estimating a real function on the unit hypercube from the noiseless observation of its value at randomly sampled points; the convergence depends on the Sobolev smoothness of the function and of a chosen kernel. Finally, we apply our analysis beyond the supervised learning setting to obtain convergence rates for the averaging process (a.k.a.

name change, stochastic gradient descent, tight nonparametric convergence rate, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.66)

Add feedback

Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model

Neural Information Processing SystemsMay-26-2025, 17:27:45 GMT

In the context of statistical supervised learning, the noiseless linear model assumes that there exists a deterministic linear relation Y \langle \theta_*, \Phi(U) \rangle between the random output Y and the random feature vector \Phi(U), a potentially non-linear transformation of the inputs U . We analyze the convergence of single-pass, fixed step-size stochastic gradient descent on the least-square risk under this model. The convergence of the iterates to the optimum \theta_* and the decay of the generalization error follow polynomial convergence rates with exponents that both depend on the regularities of the optimum \theta_* and of the feature vectors \Phi(U) . We interpret our result in the reproducing kernel Hilbert space framework. As a special case, we analyze an online algorithm for estimating a real function on the unit hypercube from the noiseless observation of its value at randomly sampled points; the convergence depends on the Sobolev smoothness of the function and of a chosen kernel.

artificial intelligence, machine learning, tight nonparametric convergence rate, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Review for NeurIPS paper: Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model

Neural Information Processing SystemsJan-22-2025, 06:22:48 GMT

Weaknesses: While the paper is written very clearly, there are several questions I'd like to raise. Firstly, in discussing the applicability of the results the paper mentions'some basic vision or sound recognition tasks' (line 33) - I'd like to ask about some examples of such tasks. Looking at the statement of the Theorem 1, seems that it should be applicable in finite-dimensional spaces with invertible covariance matrices. If it is so, then I do not understand the results. In particular, for X distributed with a finite support and has identity covariance matrix, the conditions (a) and (b) hold for arbitrarily large positive \alpha, however the theorem statement implies that the estimates will go to zero at an arbitrarily large polynomial rate, which is not true.

noiseless linear model, stochastic gradient descent, tight nonparametric convergence rate, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.85)

Add feedback

Review for NeurIPS paper: Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model

Neural Information Processing SystemsJan-22-2025, 06:22:40 GMT

The submission considers noiseless (/low noise) linear regression with non-linear transformation of the input data, and show that under this setting, SGD achieves faster convergence rates. This is a very nice contribution with applicability to important problems as mentioned by the authors in their feedback. We urge the authors to incorporate the points they made in response to the reviews.

artificial intelligence, machine learning, tight nonparametric convergence rate, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.85)

Add feedback

Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model

Neural Information Processing SystemsOct-9-2024, 16:09:06 GMT

noiseless linear model, stochastic gradient descent, tight nonparametric convergence rate, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback